Google's 43 Rules of Machine Learning
- Don't be afraid to launch a product without machine learning
- A useful ML tool needs a large amount of data before it can outperform simpler heuristic algorithms
- Don't shift to ML until you have data
- First, design and implement metrics
- Track as much as possible in the current system so that it is easier to build ML systems on top of it later
- Choose machine learning over complex heuristics
- Once a heuristic becomes complex, it is much harder to maintain and update than an ML pipeline
- Test the infrastructure independently from the machine learning
- Encapsulate the learning parts of the system so the rest can be tested
- Make sure a model performs the same in the production environment as it does in the training environment
- Be careful about dropped data when copying pipelines
- An existing pipeline may drop data it does not need, but the new pipeline may need that data, so check what gets filtered out when copying
- Turn heuristics into features, or handle them externally
- Use existing rules and heuristics in your ML
- They probably contain intuition about the problem that you don't want to throw away
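
To make the last rule concrete, here is a minimal Python sketch of both options: turning a heuristic into a feature and handling a hard rule externally. Everything in it is a made-up illustration rather than anything from Google's guide: the `Message` fields, the `legacy_spam_score` heuristic and its weights, the `BLOCKED_SENDERS` set, and the assumption that the model is any callable returning a spam probability.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical blocklist: a hard rule handled outside the model.
BLOCKED_SENDERS = {"spam@example.com"}


@dataclass
class Message:
    sender: str
    num_links: int
    num_exclamations: int


def legacy_spam_score(msg: Message) -> float:
    """Hypothetical hand-tuned heuristic that predates the ML system."""
    return 2.0 * msg.num_links + 0.5 * msg.num_exclamations


def extract_features(msg: Message) -> Dict[str, float]:
    """Build model inputs that keep the heuristic's intuition."""
    return {
        # The heuristic's output becomes one feature among others.
        "legacy_spam_score": legacy_spam_score(msg),
        # The heuristic's raw inputs are exposed too, so the model
        # can learn its own weights for them.
        "num_links": float(msg.num_links),
        "num_exclamations": float(msg.num_exclamations),
    }


def classify(msg: Message, model: Callable[[Dict[str, float]], float]) -> bool:
    """Return True if the message should be treated as spam."""
    # External handling: the blocklist is applied before the model runs.
    if msg.sender in BLOCKED_SENDERS:
        return True
    # Assumed convention: the model returns a spam probability in [0, 1].
    return model(extract_features(msg)) >= 0.5


def toy_model(features: Dict[str, float]) -> float:
    """Stand-in for a trained model, used only to exercise the code."""
    return 1.0 if features["legacy_spam_score"] > 5.0 else 0.0
```

Because `classify` accepts any callable, a stub like `toy_model` can stand in for a trained model, which also fits the earlier advice to test the infrastructure independently of the learning parts.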